Freely available and easy-to-use audio editing tools make it straightforward to perform audio splicing. Convincing forgeries can be created by combining various speech samples from the same person. Detection of such splices is important both in the public sector when considering misinformation, and in legal contexts to verify the integrity of evidence. Unfortunately, most existing detection algorithms for audio splicing use handcrafted features and make specific assumptions. However, criminal investigators often face audio samples from unconstrained sources with unknown characteristics, which raises the need for more generally applicable methods. With this work, we aim to take a first step towards unconstrained audio splicing detection to satisfy this need. We simulate various attack scenarios in the form of post-processing operations that may disguise splicing. We propose a Transformer sequence-to-sequence (seq2seq) network for splicing detection and localization. Our extensive evaluation shows that the proposed method outperforms existing dedicated approaches for splicing detection [3, 10] as well as the general-purpose networks EfficientNet [28] and RegNet [25].
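The abstract describes the architecture only at a high level. A minimal sketch of what a Transformer seq2seq splicing localizer could look like, built on PyTorch's stock `nn.Transformer`; all layer sizes, the feature dimension, and the choice of feeding the projected input as the decoder query are our assumptions, not details from the paper:

```python
import torch
import torch.nn as nn

class SplicingSeq2Seq(nn.Module):
    """Sketch: encode a sequence of per-frame audio features and decode
    a per-frame label sequence (e.g. spliced vs. authentic)."""
    def __init__(self, feat_dim=64, d_model=128, num_labels=2):
        super().__init__()
        self.in_proj = nn.Linear(feat_dim, d_model)
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=4,
            num_encoder_layers=2, num_decoder_layers=2,
            dim_feedforward=256, batch_first=True)
        self.out_proj = nn.Linear(d_model, num_labels)

    def forward(self, feats):
        # Use the projected input itself as the decoder query sequence,
        # so the model emits one label per input frame.
        x = self.in_proj(feats)
        h = self.transformer(src=x, tgt=x)
        return self.out_proj(h)  # (batch, frames, num_labels)

model = SplicingSeq2Seq()
spectrogram = torch.randn(2, 100, 64)  # 2 clips, 100 frames, 64 mel bins
logits = model(spectrogram)
print(logits.shape)  # torch.Size([2, 100, 2])
```

Per-frame labels make localization a sequence-labeling problem, which is one natural fit for a seq2seq formulation.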
Controllable image captioning models generate human-like image descriptions, enabling some kind of control over the generated captions. This paper focuses on controlling the caption length, i.e. a short and concise description or a long and detailed one. Since existing image captioning datasets contain mostly short captions, generating long captions is challenging. To address the shortage of long training examples, we propose to enrich the dataset with varying-length self-generated captions. These, however, might be of varying quality and are thus unsuitable for conventional training. We introduce a novel training strategy that selects the data points to be used at different times during the training. Our method dramatically improves the length-control abilities, while exhibiting SoTA performance in terms of caption quality. Our approach is general and is shown to be applicable also to paragraph generation.
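The paper's key ingredient is a training strategy that selects which data points to use at different times during training. A hedged, purely illustrative sketch of such a schedule; the linear ramp, the `quality` field, and all names are our assumptions, not the paper's exact rule:

```python
import random

def select_batch(human_data, generated_data, epoch, total_epochs, batch_size=32):
    """Curriculum-style selection sketch: rely mostly on human-written
    captions early in training, and gradually mix in self-generated
    captions, admitting only the highest-quality ones at each stage."""
    frac_generated = epoch / total_epochs          # linear schedule (assumed)
    n_gen = int(batch_size * frac_generated)
    # keep only the top-scoring generated captions as the candidate pool
    pool = sorted(generated_data, key=lambda ex: ex["quality"], reverse=True)
    pool = pool[: max(n_gen * 4, 1)]
    batch = random.sample(human_data, batch_size - n_gen)
    batch += random.sample(pool, min(n_gen, len(pool)))
    return batch

human = [{"caption": f"h{i}", "quality": 1.0} for i in range(100)]
gen = [{"caption": f"g{i}", "quality": random.random()} for i in range(100)]
batch = select_batch(human, gen, epoch=5, total_epochs=10)
print(len(batch))  # 32
```

Gating low-quality self-generated captions out of early training is one way to use noisy augmented data without letting it dominate the conventional training signal.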
Tracking all nuclei of an embryo in noisy and dense fluorescence microscopy data is a challenging task. We build upon a recent method for nuclei tracking that combines weakly-supervised learning from a small set of nuclei center point annotations with an integer linear program (ILP) for optimal cell lineage extraction. Our work specifically addresses the following challenging properties of C. elegans embryo recordings: (1) many cell divisions, as compared to benchmark recordings of other organisms, and (2) the presence of polar bodies that are easily mistaken for cell nuclei. To cope with (1), we devise and incorporate a learned cell division detector. To cope with (2), we employ a learned polar body detector. We further propose automated tuning of the ILP weights via a structured SVM, alleviating the need for tedious manual setup via respective grid search. Our method outperforms the previous leader of the cell tracking challenge on the Fluo-N3DH-CE embryo dataset. We report a further extensive quantitative evaluation on two additional C. elegans datasets. We will make these datasets public to serve as an extended benchmark for future method development. Our results show that our method yields considerable improvements, in particular in the correctness of division event detection and the number and length of fully correct track segments. Code: https://github.com/funkelab/linajea
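The lineage extraction is formulated as a global ILP. As a drastically simplified, hypothetical illustration of the underlying linking problem, the toy below matches detected nuclei between two consecutive frames by minimizing total movement cost via brute force; a real lineage ILP additionally models divisions, appearances, and disappearances:

```python
from itertools import permutations
import math

def link_frames(prev_pts, next_pts):
    """Toy stand-in for the ILP: exhaustively pick the assignment of
    nuclei in frame t to nuclei in frame t+1 that minimizes the total
    Euclidean movement. Only feasible for tiny examples."""
    best_cost, best_links = math.inf, None
    for perm in permutations(range(len(next_pts)), len(prev_pts)):
        cost = sum(math.dist(prev_pts[i], next_pts[j])
                   for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best_links = cost, list(enumerate(perm))
    return best_links, best_cost

frame_t  = [(0.0, 0.0), (5.0, 5.0)]
frame_t1 = [(5.2, 4.9), (0.1, 0.2)]
links, cost = link_frames(frame_t, frame_t1)
print(links)  # [(0, 1), (1, 0)]
```

An ILP solver replaces this brute-force search at scale, and learned detectors (e.g. for divisions and polar bodies) contribute additional cost terms.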
We establish a law of large numbers for the empirical distribution of the parameters of a one-hidden-layer artificial neural network with sparse connectivity, trained by stochastic gradient descent, in the simultaneous limit of an increasing number of neurons and training iterations.
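The statement can be sketched in symbols; the notation below is illustrative (sparse connectivity suppressed) and not taken from the paper:

```latex
% One-hidden-layer network with $N$ neurons and its empirical measure
% of parameters after $k$ SGD steps:
\[
  g^N(x) \;=\; \frac{1}{N}\sum_{i=1}^{N} c^i\,\sigma\!\left(w^i \cdot x\right),
  \qquad
  \nu^N_k \;=\; \frac{1}{N}\sum_{i=1}^{N} \delta_{(c^i_k,\,w^i_k)}
\]
% Law of large numbers: neurons and SGD iterations grow together,
\[
  \nu^N_{\lfloor N t \rfloor} \;\longrightarrow\; \bar{\nu}_t
  \quad (N \to \infty), \qquad \bar{\nu}_t \text{ deterministic.}
\]
```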
Stuttering is a speech disorder during which the flow of speech is interrupted by involuntary pauses and repetitions of sounds. Stuttering identification is an interesting interdisciplinary research problem involving pathology, psychology, acoustics, and signal processing, which makes detection difficult and complex. Recent developments in machine and deep learning have dramatically revolutionized the speech domain, yet stuttering identification has received minimal attention. This work fills the gap by attempting to bring researchers together from interdisciplinary fields. In this paper, we comprehensively review acoustic features as well as statistical and deep learning based stuttering/disfluency classification methods. We also present several challenges and future directions.
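One of the disfluency cues mentioned above is the involuntary pause. A minimal, hypothetical sketch of turning that cue into a feature: flag runs of consecutive low-energy frames as candidate pauses. The threshold and minimum run length are arbitrary illustrative choices, not values from the surveyed literature:

```python
def detect_pauses(frame_energies, threshold=0.01, min_frames=3):
    """Return (start, end) frame index ranges where the signal energy
    stays below `threshold` for at least `min_frames` frames."""
    pauses, start = [], None
    for i, e in enumerate(frame_energies):
        if e < threshold:
            start = i if start is None else start
        else:
            if start is not None and i - start >= min_frames:
                pauses.append((start, i))
            start = None
    if start is not None and len(frame_energies) - start >= min_frames:
        pauses.append((start, len(frame_energies)))
    return pauses

energies = [0.5, 0.4, 0.001, 0.002, 0.001, 0.003, 0.6, 0.5]
print(detect_pauses(energies))  # [(2, 6)]
```

Statistical classifiers would consume such hand-designed features, whereas deep models typically learn them directly from spectrograms.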
We present a novel method for proposal free instance segmentation that can handle sophisticated object shapes which span large parts of an image and form dense object clusters with crossovers. Our method is based on predicting dense local shape descriptors, which we assemble to form instances. All instances are assembled simultaneously in one go. To our knowledge, our method is the first non-iterative method that yields instances that are composed of learnt shape patches. We evaluate our method on a diverse range of data domains, where it defines the new state of the art on four benchmarks, namely the ISBI 2012 EM segmentation benchmark, the BBBC010 C. elegans dataset, and 2d as well as 3d fluorescence microscopy data of cell nuclei. We show furthermore that our method also applies to 3d light microscopy data of Drosophila neurons, which exhibit extreme cases of complex shape clusters.
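The core idea of assembling dense local predictions in one pass can be illustrated with a toy voting scheme; the patch size, the binary "shape patch" content, and the vote threshold are all our simplifications, not the paper's actual descriptor:

```python
import numpy as np

def assemble_votes(patch_preds, image_shape, patch=3):
    """Sketch of non-iterative assembly: every annotated pixel predicts
    a small local shape patch (here, binary foreground votes); the
    overlapping patches are accumulated in a single pass and pixels
    supported by at least two votes are kept.
    Assumes patch centers lie in the image interior."""
    votes = np.zeros(image_shape, dtype=float)
    r = patch // 2
    for (y, x), p in patch_preds.items():
        votes[y - r:y + r + 1, x - r:x + r + 1] += p
    return votes >= 2.0

preds = {(2, 2): np.ones((3, 3)), (3, 3): np.ones((3, 3))}
mask = assemble_votes(preds, (6, 6))
print(int(mask.sum()))  # 4
```

Because every patch is predicted up front, the assembly needs no iterative refinement, which mirrors the "all instances in one go" property claimed above.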
Single image super-resolution is the task of inferring a high-resolution image from a single low-resolution input. Traditionally, the performance of algorithms for this task is measured using pixel-wise reconstruction measures such as peak signal-to-noise ratio (PSNR), which have been shown to correlate poorly with the human perception of image quality. As a result, algorithms minimizing these metrics tend to produce over-smoothed images that lack high-frequency textures and do not look natural despite yielding high PSNR values. We propose a novel application of automated texture synthesis in combination with a perceptual loss focusing on creating realistic textures rather than optimizing for a pixel-accurate reproduction of ground truth images during training. By using feed-forward fully convolutional neural networks in an adversarial training setting, we achieve a significant boost in image quality at high magnification ratios. Extensive experiments on a number of datasets show the effectiveness of our approach, yielding state-of-the-art results in both quantitative and qualitative benchmarks.
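The mechanism of a perceptual loss, comparing images in the feature space of a fixed convolutional network instead of pixel space, can be sketched as follows. A real implementation would use a pretrained network such as VGG; the tiny random, frozen extractor here is an assumption made only to keep the example self-contained:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PerceptualLoss(nn.Module):
    """Sketch: MSE between deep features of the super-resolved output
    and the ground-truth image, rather than between raw pixels."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        for p in self.features.parameters():
            p.requires_grad = False  # the feature network stays fixed

    def forward(self, sr, hr):
        return F.mse_loss(self.features(sr), self.features(hr))

loss_fn = PerceptualLoss()
sr = torch.rand(1, 3, 32, 32)  # super-resolved output
hr = torch.rand(1, 3, 32, 32)  # ground-truth high-resolution image
loss = loss_fn(sr, hr)
print(loss.item() >= 0)  # True
```

In an adversarial setup this term is typically combined with a discriminator loss, pushing the generator toward textures that look natural rather than merely pixel-accurate.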